Nth attempt to resolve port collisions once-and-for-all #9850

Open
kaleb-himes wants to merge 1 commit into wolfSSL:master from kaleb-himes:p-collide-nth-solve

kaleb-himes (Contributor) commented Mar 2, 2026

Description

I've been seeing port collisions in Jenkins again, in spite of bwrap having worked for us for many years now. This is an attempt to further reduce the probability of port collisions in the make check scripts that use random port generation.

This solution introduces the concept of "remembering ports assigned" in addition to checking "already assigned ports on the system".
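
A minimal sketch of the "remember assigned ports" idea, assuming a bash array plus a best-effort `ss` lookup. The helper names `port_already_assigned` and `port_in_use_on_system`, the `MAX_ATTEMPTS` cap, and the read from /dev/urandom (chosen here so the read never blocks) are illustrative assumptions, not the code in this PR:

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and structure are assumptions, not the PR's code.

PORT_MIN=49512      # lower bound of the range quoted in this PR
PORT_MAX=65534      # upper bound of the range quoted in this PR
MAX_ATTEMPTS=100    # assumed retry cap
used_ports=()       # ports already handed out during this script run

# Return 0 if this script run already handed out the candidate port.
port_already_assigned() {
    local candidate=$1 p
    for p in "${used_ports[@]}"; do
        [ "$p" -eq "$candidate" ] && return 0
    done
    return 1
}

# Best effort: return 0 if a TCP listener is bound to the candidate port.
# Reports "not in use" when ss is unavailable.
port_in_use_on_system() {
    local candidate=$1
    command -v ss >/dev/null 2>&1 || return 1
    ss -lnt 2>/dev/null |
        awk -v p=":$candidate" '$4 ~ (p "$") {found=1} END {exit !found}'
}

# Sets the global $port. Call it directly, not via $(...), so that the
# update to used_ports is not lost in a subshell.
generate_port() {
    local attempts=0 raw candidate
    while [ "$attempts" -lt "$MAX_ATTEMPTS" ]; do
        raw=$(od -An -N2 -tu2 /dev/urandom | tr -d '[:space:]')
        candidate=$(( raw % (PORT_MAX - PORT_MIN + 1) + PORT_MIN ))
        if ! port_already_assigned "$candidate" &&
           ! port_in_use_on_system "$candidate"; then
            used_ports+=("$candidate")
            port=$candidate
            return 0
        fi
        attempts=$((attempts + 1))
    done
    echo "generate_port: no free port after $MAX_ATTEMPTS attempts" >&2
    return 1
}

generate_port && echo "first port:  $port"
generate_port && echo "second port: $port"
```

The system check only narrows the window: another process can still bind the port between the check and the moment the server starts.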

Testing:

Many, many cycles running the openssl/ocsp scripts in tight loops. The probability of a collision in the worst-case script (openssl) is estimated to have been:

# 16023 == port range
# 11 == 5x openssl servers & 6x wolfSSL servers in 

  P ≈ 1 − e^(−(11×10) / (2 × 16023))
    = 1 − e^(−110 / 32046)
    = 1 − e^(−0.003433)
    ≈ 0.343% per run

  Expected runs before seeing one collision: 1 / 0.00343 ≈ 292 runs

That is 1 collision per 292 runs of the openssl script (we test more runs than that in each PR due to the many config options).
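
The estimate above follows the birthday-problem approximation P ≈ 1 − e^(−k(k−1)/(2N)) with k = 11 servers and N = 16023 ports. A small sketch that reproduces the arithmetic with awk; `collision_prob` is a hypothetical helper name used only for illustration:

```shell
#!/usr/bin/env bash
# Birthday-problem approximation: P ~= 1 - exp(-k(k-1) / (2N))
# collision_prob is a hypothetical helper, not part of the test scripts.
collision_prob() {
    local servers=$1 range=$2
    awk -v k="$servers" -v n="$range" 'BEGIN {
        p = 1 - exp(-(k * (k - 1)) / (2 * n))
        # Print the probability as a percentage and the expected number
        # of runs per collision (1 / p).
        printf "%.3f %.0f\n", p * 100, 1 / p
    }'
}

read -r percent runs <<< "$(collision_prob 11 16023)"
echo "collision probability: ${percent}% per run"
echo "expected runs per collision: ${runs}"
```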

With this proposed change the probability is estimated to drop from 0.343% to 0% inside a single run of a script, and to ~0.006% inter-script (multiple copies of the script executing on the same machine). The new logic checks for already-handed-out ports on the machine, but a small probability remains that a port is free when checked and no longer free by the time it gets used, if another script grabs the same one.

I left the test scripts running all weekend without the fix and with the fix.

In the shell running the old code we saw 52 collisions over the course of 48 hours.
In the shell running with these changes in place we saw 0 collisions over the course of 48 hours.

I also created two "simulation" scripts to run many more tests in 10-minute intervals than can be achieved with the actual live TLS connections in 48 hours. These are the results:

OLD SOLUTION w/ bwrap in place, using /dev/random for port values with no memory of assigned ports:

========================================================================
  wolfSSL port-collision demo  —  OLD generate_port() (no dedup)
========================================================================
  Servers per simulated run : 11
  Port range                : 49512 – 65534  (N=16023)
  Theoretical collision prob: 0.343% per run
  Timeout                   : 48h  (until 2026-03-04 15:17:39)
========================================================================

  elapsed             runs          collisions    obs-rate      theory-rate

  +10m00s              297522        987           0.331%        0.343%

NEW SOLUTION w/ bwrap in place, using /dev/random for port values and having memory of assigned ports:

========================================================================
  wolfSSL port-collision demo  —  NEW generate_port() (with dedup)
========================================================================
  Servers per simulated run : 11
  Port range                : 49512 – 65534  (N=16023)
  Old method theory rate    : 0.343% per run  (for comparison)
  Expected new method rate  : 0.000%  (guaranteed by used_ports[])
  System port check (ss)    : skipped (simulation — no ports bound; set SIMULATE_ONLY=0 to enable)
  Timeout                   : 48h  (until 2026-03-04 15:17:39)
========================================================================

  elapsed             runs          collisions    status

  +10m00s              249485        0             OK
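
For readers who want to reproduce the shape of these numbers, here is a rough Monte Carlo sketch. It is not the simulation script used above: the function names are hypothetical, `$RANDOM` stands in for the /dev/random read, no ports are bound, and it assumes bash 4 or newer for associative arrays.

```shell
#!/usr/bin/env bash
# Hypothetical simulation sketch; requires bash 4+ (associative arrays).
PORT_MIN=49512
PORT_RANGE=16023    # 49512..65534 inclusive
SERVERS=11

# Old behaviour: draw SERVERS ports with no memory.
# Returns 0 if any two draws within the run collide.
old_run_collides() {
    local -A seen=()
    local i p
    for ((i = 0; i < SERVERS; i++)); do
        p=$(( ((RANDOM << 15) | RANDOM) % PORT_RANGE + PORT_MIN ))
        [ -n "${seen[$p]:-}" ] && return 0
        seen[$p]=1
    done
    return 1
}

# New behaviour: redraw until the port is unseen in this run.
# Returns 0 only if fewer than SERVERS distinct ports were handed out.
new_run_collides() {
    local -A seen=()
    local i p
    for ((i = 0; i < SERVERS; i++)); do
        p=$(( ((RANDOM << 15) | RANDOM) % PORT_RANGE + PORT_MIN ))
        while [ -n "${seen[$p]:-}" ]; do
            p=$(( ((RANDOM << 15) | RANDOM) % PORT_RANGE + PORT_MIN ))
        done
        seen[$p]=1
    done
    [ "${#seen[@]}" -ne "$SERVERS" ]
}

# usage: simulate <old|new> <runs>; sets the global $collisions
simulate() {
    local mode=$1 runs=$2 r
    collisions=0
    for ((r = 0; r < runs; r++)); do
        if "${mode}_run_collides"; then
            collisions=$((collisions + 1))
        fi
    done
}

simulate old 2000; echo "old: $collisions collisions in 2000 runs"
simulate new 2000; echo "new: $collisions collisions in 2000 runs"
```

The old variant should land near the theoretical rate over enough runs, while the deduplicating variant reports zero by construction, which is what the `used_ports[]` guarantee in the log above refers to.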

Checklist

  • added tests
  • updated/added doxygen
  • updated appropriate READMEs
  • Updated manual and documentation

Copilot AI (Contributor) left a comment

Pull request overview

This PR aims to eliminate flaky port collisions in CI by improving the random-port selection logic in multiple make check scripts, adding intra-script deduplication and a best-effort check against ports already bound on the system.

Changes:

  • Add used_ports tracking to ensure ports handed out within a single script run are unique.
  • Enhance generate_port() to retry on collisions and (when available) detect already-bound ports via ss/netstat.
  • Apply the updated generate_port() logic across several OpenSSL/OCSP-related test scripts.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| scripts/rsapss.test | Adds per-run port dedup + system-bound port check in generate_port() |
| scripts/openssl_srtp.test | Adds per-run port dedup + system-bound port check in generate_port() |
| scripts/openssl.test | Adds per-run port dedup + system-bound port check in generate_port() |
| scripts/ocsp-stapling_tls13multi.test | Adds per-run port dedup + system-bound port check in generate_port() |
| scripts/ocsp-stapling2.test | Adds per-run port dedup + system-bound port check in generate_port() |
| scripts/ocsp-stapling.test | Adds per-run port dedup + system-bound port check in generate_port() |


[ $collision -eq 0 ] && break

((attempts++))

Copilot AI Mar 3, 2026

In this script set -e is enabled, and ((attempts++)) will return a non-zero status the first time it runs (because the arithmetic expression evaluates to 0 before increment). That can cause the whole test script to exit early on the first detected collision instead of retrying. Use an increment form that won’t trip set -e (e.g., attempts=$((attempts + 1))), or explicitly ignore the status (e.g., ((attempts++)) || true).

Suggested change
- ((attempts++))
+ attempts=$((attempts + 1))
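
The behaviour described in this comment can be checked in isolation. The sketch below runs each variant in a child bash so that set -e cannot abort the demo itself. It assumes bash 4.1 or newer, since older versions are reported to treat `(( ))` differently under set -e, and the variable names are illustrative:

```shell
#!/usr/bin/env bash
# Each case runs in a child bash; the parent only inspects output and status.

# Post-increment from 0: the expression value is 0, so (( )) returns
# status 1 and set -e ends the child before it can print anything.
out_post=$(bash -c 'set -e; attempts=0; ((attempts++)); echo "reached attempts=$attempts"')
post_status=$?

# Assignment with arithmetic expansion returns status 0.
out_assign=$(bash -c 'set -e; attempts=0; attempts=$((attempts + 1)); echo "reached attempts=$attempts"')

# Explicitly ignoring the status also keeps the script alive.
out_or_true=$(bash -c 'set -e; attempts=0; ((attempts++)) || true; echo "reached attempts=$attempts"')

echo "((attempts++))           -> status $post_status, output '${out_post}'"
echo "attempts=\$((attempts+1)) -> output '${out_assign}'"
echo "((attempts++)) || true   -> output '${out_or_true}'"
```

The assignment form is usually the easier one to read in a retry loop, since it carries no hidden exit status to reason about.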
